49 research outputs found

    I/O-Efficient Dynamic Planar Range Skyline Queries

    We present the first fully dynamic worst case I/O-efficient data structures that support planar orthogonal 3-sided range skyline reporting queries in $O(\log_{2B^\epsilon} n + \frac{t}{B^{1-\epsilon}})$ I/Os and updates in $O(\log_{2B^\epsilon} n)$ I/Os, using $O(\frac{n}{B^{1-\epsilon}})$ blocks of space, for $n$ input planar points, $t$ reported points, and parameter $0 \leq \epsilon \leq 1$. We obtain the result by extending Sundar's priority queues with attrition to support the operations DeleteMin and CatenateAndAttrite in $O(1)$ worst case I/Os, and in $O(1/B)$ amortized I/Os given that a constant number of blocks is already loaded in main memory. Finally, we show that any pointer-based static data structure that supports dominated maxima reporting queries, namely the difficult special case of 4-sided skyline queries, in $O(\log^{O(1)} n + t)$ worst case time must occupy $\Omega(n \frac{\log n}{\log \log n})$ space, by adapting a similar lower bounding argument for planar 4-sided range reporting queries. Comment: Submitted to SODA 201
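
A minimal in-memory sketch of a priority queue with attrition may help make the operations named above concrete. It assumes the usual semantics: InsertAndAttrite(x) discards every stored element greater than or equal to x before appending x, and CatenateAndAttrite(Q1, Q2) discards from Q1 every element greater than or equal to min(Q2) and then appends Q2. The class and method names are illustrative, and the sketch does not model blocks or I/Os, so it is not the structure described in the abstract.

```python
from collections import deque


class AttritionQueue:
    """In-memory priority queue with attrition (illustrative, not I/O-efficient).

    Stored elements are always in strictly increasing order, so the
    minimum is at the front of the deque.
    """

    def __init__(self, items=()):
        self._items = deque()
        for x in items:
            self.insert_and_attrite(x)

    def find_min(self):
        # Raises IndexError on an empty queue.
        return self._items[0]

    def delete_min(self):
        return self._items.popleft()

    def insert_and_attrite(self, x):
        # Attrition: drop every stored element >= x, then append x.
        while self._items and self._items[-1] >= x:
            self._items.pop()
        self._items.append(x)

    def catenate_and_attrite(self, other):
        # Drop elements of self that are >= min(other), then append other.
        if other._items:
            smallest = other._items[0]
            while self._items and self._items[-1] >= smallest:
                self._items.pop()
            self._items.extend(other._items)
            other._items.clear()

    def to_list(self):
        return list(self._items)
```

In this sketch, inserting 5, 3, 8, 4, 9 in that order leaves the queue holding 3, 4, 9, because 5 is attrited by 3 and 8 is attrited by 4.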

    Longest Common Subsequence on Weighted Sequences

    We consider the general problem of the Longest Common Subsequence (LCS) on weighted sequences. Weighted sequences are an extension of classical strings, where in each position every letter of the alphabet may occur with some probability. Previous results presented a PTAS and observed that no FPTAS is possible unless P=NP. In this paper we essentially close the gap between upper and lower bounds by improving both. First of all, we provide an EPTAS for bounded alphabets (which is the most natural case), and prove that there does not exist any EPTAS for unbounded alphabets unless FPT=W[1]. Furthermore, under the Exponential Time Hypothesis, we provide a lower bound which shows that no significantly better PTAS can exist for unbounded alphabets. As a side note, we prove that it is sufficient to work with only one threshold in the general variant of the problem.

    A Space-Optimal Hidden Surface Removal Algorithm for Iso-Oriented Rectangles

    We investigate the problem of finding the visible pieces of a scene of objects from a specified viewpoint. In particular, we are interested in the design of an efficient hidden surface removal algorithm for a scene comprised of iso-oriented rectangles. We propose an algorithm that, given a set of $n$ iso-oriented rectangles, reports all visible surfaces in $O((n+k)\log n)$ time and linear space, where $k$ is the number of surfaces reported. The previous best result, by Bern, has the same time complexity but uses $O(n\log n)$ space.

    I/O-Efficient Planar Range Skyline and Attrition Priority Queues

    In the planar range skyline reporting problem, we store a set $P$ of $n$ 2D points in a structure such that, given a query rectangle $Q = [a_1, a_2] \times [b_1, b_2]$, the maxima (a.k.a. skyline) of $P \cap Q$ can be reported efficiently. The query is 3-sided if an edge of $Q$ is grounded, giving rise to two variants: top-open ($b_2 = \infty$) and left-open ($a_1 = -\infty$) queries. All our results are in external memory under the $O(n/B)$ space budget, for both the static and dynamic settings:
    * For static $P$, we give structures that answer top-open queries in $O(\log_B n + k/B)$, $O(\log\log_B U + k/B)$, and $O(1 + k/B)$ I/Os when the universe is $\mathbb{R}^2$, a $U \times U$ grid, and a rank space grid $[O(n)]^2$, respectively (where $k$ is the number of reported points). The query complexity is optimal in all cases.
    * We show that the left-open case is harder, such that any linear-size structure must incur $\Omega((n/B)^\epsilon + k/B)$ I/Os for a query. We show that this case is as difficult as the general 4-sided queries, for which we give a static structure with the optimal query cost $O((n/B)^\epsilon + k/B)$.
    * We give a dynamic structure that supports top-open queries in $O(\log_{2B^\epsilon}(n/B) + k/B^{1-\epsilon})$ I/Os, and updates in $O(\log_{2B^\epsilon}(n/B))$ I/Os, for any $\epsilon$ satisfying $0 \le \epsilon \le 1$. This leads to a dynamic structure for 4-sided queries with optimal query cost $O((n/B)^\epsilon + k/B)$, and amortized update cost $O(\log(n/B))$.
    As a contribution of independent interest, we propose an I/O-efficient version of the fundamental structure priority queue with attrition (PQA). Our PQA supports FindMin, DeleteMin, and InsertAndAttrite all in $O(1)$ worst case I/Os, and $O(1/B)$ amortized I/Os per operation. We also add the new CatenateAndAttrite operation that catenates two PQAs in $O(1)$ worst case and $O(1/B)$ amortized I/Os. This operation is a non-trivial extension to the classic PQA of Sundar, even in internal memory. Comment: Appeared at PODS 2013, New York, 19 pages, 10 figures. arXiv admin note: text overlap with arXiv:1208.4511, arXiv:1207.234
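
To illustrate what a range skyline query returns, here is a brute-force in-memory sketch. It filters the points inside the query rectangle and then sweeps from right to left, keeping each point whose y-coordinate exceeds every y seen so far. The function name and the tuple representation of points are assumptions made for illustration; the sketch runs in O(n log n) time per query and says nothing about the external-memory structures of the abstract.

```python
def range_skyline(points, a1, a2, b1, b2):
    """Return the maxima of the points inside [a1, a2] x [b1, b2].

    A point is maximal if no other point in the rectangle has both
    coordinates at least as large. The result is ordered by decreasing x.
    Exact duplicates are reported once.
    """
    inside = [(x, y) for (x, y) in points if a1 <= x <= a2 and b1 <= y <= b2]
    # Sort by decreasing x; among equal x, the highest point comes first.
    inside.sort(key=lambda p: (-p[0], -p[1]))
    skyline = []
    best_y = float("-inf")
    for x, y in inside:
        # Every point already seen has x at least as large, so (x, y)
        # is maximal exactly when its y beats all of them.
        if y > best_y:
            skyline.append((x, y))
            best_y = y
    return skyline
```

A top-open query corresponds to passing `float("inf")` as `b2`, and a left-open query to passing `float("-inf")` as `a1`.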

    Investigation of Database Models for Evolving Graphs

    We deal with the efficient implementation of storage models for time-varying graphs. To this end, we present an improved approach for the HiNode vertex-centric model based on MongoDB. This approach, apart from its inherent space optimality, exhibits significant improvements in global query execution times, which is the most challenging query type for entity-centric approaches. Compared to an implementation based on Cassandra, not only are significant speedups achieved, but more expensive queries can be executed as well, owing to the capability to exploit indices to a larger extent and to benefit from in-database query processing.

    Continuous Outlier Mining of Streaming Data in Flink

    In this work, we focus on distance-based outliers in a metric space, where the status of an entity as to whether it is an outlier is based on the number of other entities in its neighborhood. In recent years, several solutions have tackled the problem of distance-based outliers in data streams, where outliers must be mined continuously as new elements become available. An interesting research problem is to combine the streaming environment with massively parallel systems to provide scalable stream-based algorithms. However, none of the previously proposed techniques refer to a massively parallel setting. Our proposal fills this gap and investigates the challenges in transferring state-of-the-art techniques to Apache Flink, a modern platform for intensive streaming analytics. We thoroughly present the technical challenges encountered and the alternatives that may be applied. We show speed-ups of up to 117 (resp. 2076) times over a naive parallel (resp. non-parallel) solution in Flink, by using just an ordinary four-core machine and a real-world dataset. When moving to a three-machine cluster, due to less contention, we manage to achieve both better scalability in terms of the window slide size and the data dimensionality, and even higher speed-ups, e.g., by a factor of 510. Overall, our results demonstrate that outlier mining can be achieved in an efficient and scalable manner. The resulting techniques have been made publicly available as open-source software.
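
The notion of a distance-based outlier used above can be sketched with a naive quadratic check over one window of points. The parameter names `radius` and `k`, the choice of Euclidean distance, and the function name are assumptions made for illustration; this is the non-parallel baseline idea only, not the Flink techniques the abstract describes.

```python
import math


def distance_outliers(window, radius, k):
    """Return the points of `window` with fewer than k neighbors.

    A neighbor of p is any other point of the window whose Euclidean
    distance to p is at most `radius`. Points are coordinate tuples.
    """
    outliers = []
    for i, p in enumerate(window):
        neighbors = sum(
            1
            for j, q in enumerate(window)
            if i != j and math.dist(p, q) <= radius
        )
        if neighbors < k:
            outliers.append(p)
    return outliers
```

In a streaming setting this check would be re-evaluated each time the window slides, which is what makes the naive approach expensive.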

    Threshold-Based Network Structural Dynamics

    Interest in dynamic processes on networks has been steadily rising in recent years. In this paper, we consider the $(\alpha,\beta)$-Thresholded Network Dynamics ($(\alpha,\beta)$-Dynamics), where $\alpha \leq \beta$, in which only structural dynamics (dynamics of the network) are allowed, guided by local thresholding rules executed in each node. In particular, in each discrete round $t$, each pair of nodes $u$ and $v$ that are allowed to communicate by the scheduler computes a value $\mathcal{E}(u,v)$ (the potential of the pair) as a function of the local structure of the network at round $t$ around the two nodes. If $\mathcal{E}(u,v) < \alpha$ then the link (if it exists) between $u$ and $v$ is removed; if $\alpha \leq \mathcal{E}(u,v) < \beta$ then an existing link between $u$ and $v$ is maintained; if $\beta \leq \mathcal{E}(u,v)$ then a link between $u$ and $v$ is established if not already present. The microscopic structure of $(\alpha,\beta)$-Dynamics appears to be simple, so that we are able to rigorously argue about it, but still flexible, so that we are able to design meaningful microscopic local rules that give rise to interesting macroscopic behaviors. Our goals are the following: a) to investigate the properties of the $(\alpha,\beta)$-Thresholded Network Dynamics and b) to show that $(\alpha,\beta)$-Dynamics is expressive enough to solve complex problems on networks. Our contribution in these directions is twofold. First, we rigorously support the claim about the expressiveness of $(\alpha,\beta)$-Dynamics, both by designing a simple protocol that provably computes the $k$-core of the network and by showing that $(\alpha,\beta)$-Dynamics is in fact Turing-Complete. Second, and most important, we construct general tools for proving stabilization that work for a subclass of $(\alpha,\beta)$-Dynamics and prove speed of convergence in a restricted setting. Comment: 29 pages, extension of the Post-print containing all proofs, to appear in SIROCCO 202
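
The thresholding rule described above can be sketched as a single synchronous round on an undirected graph. Two things here are assumptions made for illustration and are not taken from the abstract: the scheduler lets every pair of nodes communicate in every round, and the potential is the number of common neighbors of the pair. The function names are likewise hypothetical.

```python
from itertools import combinations


def common_neighbors(adj, u, v):
    # Hypothetical potential: number of neighbors shared by u and v.
    return len(adj[u] & adj[v])


def dynamics_round(adj, alpha, beta, potential=common_neighbors):
    """Apply one round of the thresholding rule and return the new graph.

    `adj` maps each node to the set of its neighbors (undirected graph).
    Potentials are computed on the graph as it was at the start of the round.
    """
    if alpha > beta:
        raise ValueError("alpha must not exceed beta")
    new_adj = {u: set(nbrs) for u, nbrs in adj.items()}
    # Assumed scheduler: every pair of nodes communicates in this round.
    for u, v in combinations(adj, 2):
        e = potential(adj, u, v)
        if e < alpha:
            # Below alpha: remove the link if it exists.
            new_adj[u].discard(v)
            new_adj[v].discard(u)
        elif e >= beta:
            # At or above beta: establish the link if not present.
            new_adj[u].add(v)
            new_adj[v].add(u)
        # Otherwise alpha <= e < beta: the pair is left unchanged.
    return new_adj
```

Iterating `dynamics_round` until the graph stops changing gives one way to observe whether a particular choice of potential and thresholds stabilizes.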